An Alternate Version of the Conceptual Predictive Statistic Based on a Symmetrized Discrepancy Measure
Abstract
The conceptual predictive statistic, Cp, is a widely used criterion for model selection in linear regression. Cp serves as an estimator of a discrepancy, a measure that reflects the disparity between the generating model and a fitted candidate model. This discrepancy, based on scaled squared error loss, is asymmetric: an alternate measure is obtained by reversing the roles of the two models in the definition of the measure. We propose a variant of the Cp statistic based on estimating a symmetrized version of the discrepancy targeted by Cp. We claim that the resulting criterion provides better protection against overfitting than Cp, since the symmetric discrepancy is more sensitive towards detecting overspecification than its asymmetric counterpart. We illustrate our claim by presenting simulation results. Finally, we demonstrate the practical utility of the new criterion by discussing a modeling application based on data collected in a cardiac rehabilitation program at University of Iowa Hospitals and Clinics.
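For orientation, the classical Cp that the abstract takes as its starting point can be sketched as below. This is the standard Mallows' Cp, computed as RSS / sigma^2_hat - n + 2p, and not the symmetrized variant proposed in the paper, whose formula is not given in this abstract. The function name `mallows_cp` and its arguments are hypothetical, and the sketch assumes the error variance is estimated from the largest candidate model.

```python
import numpy as np


def mallows_cp(X_full, X_cand, y):
    """Classical Mallows' Cp for a candidate design matrix X_cand.

    Assumption: sigma^2 is estimated from the largest model, X_full,
    and the columns of X_cand are a subset of the columns of X_full.
    """
    n, p_full = X_full.shape
    p = X_cand.shape[1]

    # Estimate the error variance from the residuals of the largest model.
    beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
    rss_full = np.sum((y - X_full @ beta_full) ** 2)
    sigma2_hat = rss_full / (n - p_full)

    # Residual sum of squares of the candidate model.
    beta_cand, *_ = np.linalg.lstsq(X_cand, y, rcond=None)
    rss_cand = np.sum((y - X_cand @ beta_cand) ** 2)

    # Scaled residual sum of squares plus a penalty on the parameter count.
    return rss_cand / sigma2_hat - n + 2 * p


if __name__ == "__main__":
    # Simulated data (illustrative only): intercept plus three regressors,
    # of which only the first regressor enters the generating model.
    rng = np.random.default_rng(0)
    n = 50
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
    y = X[:, :2] @ np.array([1.0, 2.0]) + rng.normal(size=n)
    for k in range(1, X.shape[1] + 1):
        print(k, mallows_cp(X, X[:, :k], y))
```

By construction, Cp for the largest model equals its number of parameters, so the statistic is informative only for comparing the smaller candidates against it.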
Similar resources
Discrepancy-Based Model Selection Criteria Using Cross Validation
A model selection criterion is often formulated by constructing an approximately unbiased estimator of an expected discrepancy, a measure that gauges the separation between the true model and a fitted approximating model. The expected discrepancy reflects how well, on average, the fitted approximating model predicts “new” data generated under the true model. A related measure, the estimated dis...
Model Diagnostics for Bayesian Networks
Assessing fit of psychometric models has always been an issue of enormous interest, but there exists no unanimously agreed upon item fit diagnostic for the models. Bayesian networks, frequently used in educational assessments (see, for example, Mislevy, Almond, Yan, & Steinberg, 2001) primarily for learning about students’ knowledge and skills, are no exception. This paper employs the posterior...
Discrepancy-based algorithms for best-subset model selection
The selection of a best-subset regression model from a candidate family is a common problem that arises in many analyses. In best-subset model selection, we consider all possible subsets of regressor variables; thus, numerous candidate models may need to be fit and compared. One of the main challenges of best-subset selection arises from the size of the candidate model family: specifically, the...
Object and Action Naming: A Study on Persian-Speaking Children
Objectives: Nouns and verbs are the central conceptual linguistic units of language acquisition in all human languages. While the noun-bias hypothesis claims that nouns have a privilege in children’s lexical development across languages, studies on Mandarin and Korean and other languages have challenged this view. More recent cross-linguistic naming studies on children in German, Turkish,...
A Moving Average Variation Control Chart based on Bayesian Predictive Density
Recently, several control charts based on the idea of Bayesian Predictive Density (BPD) have been introduced in the statistical process control literature. Among these charts is the variation control chart, which we refer to as the VBPD chart. In this paper we add the idea of Moving Average to the VBPD chart and introduce a new variation control chart which has all advantages of the ...